In this practical we will:
- Explore tools for spatial point pattern data wrangling and visualization.
First, let's load some useful libraries:
# For plotting
library(ggplot2)
library(scico) # for colour palettes
# Data manipulation
library(dplyr)
Spatial point process data
In point processes we measure the locations where events occur and the coordinates of such occurrences are our data.
Point process models are probabilistic models that describe the likelihood of patterns of points that represent the random location of some event. A spatial point process is a set of locations that have been generated by some form of stochastic (random) mechanism. In other words, the point process is a random variable operating in continuous space, and we observe realisations of this variable as point patterns across space (and/or time).
Consider a fixed geographical region \(A\). The set of locations at which events occur is denoted \(\mathbf{s} = \{s_1,\ldots,s_n\}\). We let \(N(A)\) be the random variable that represents the number of events in region \(A\).
Our primary interest is in measuring where events occur, so the locations are our data. We typically assume that a spatial point pattern is generated by a unique point process over the whole study area. This means that the delimitation of the study area will affect the observed point patterns.
The observed distribution of points can be described based on the intensity of points within a delimited region. We can define the (first order) intensity of a point process as the expected number of events per unit area. This can also be thought of as a measure of the density of our points. In some cases, the intensity will be constant over space (homogeneous), while in other cases it can vary by location (inhomogeneous or heterogeneous).
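To make the idea of a constant intensity concrete, here is a minimal sketch in base R that simulates a homogeneous pattern on a rectangular window. It assumes a Poisson model, which is one common choice and not something the data below are claimed to follow: the count \(N(A)\) is Poisson with mean \(\lambda |A|\), and given the count, the locations are uniform over \(A\). The function name simulate_hpp and its arguments are illustrative and not part of any package.

```r
# Simulate a homogeneous Poisson point process on a rectangle (illustrative sketch).
# lambda: intensity, i.e. expected number of points per unit area.
simulate_hpp <- function(lambda, xlim = c(0, 1), ylim = c(0, 1)) {
  area <- diff(xlim) * diff(ylim)
  n <- rpois(1, lambda * area)   # N(A) ~ Poisson(lambda * |A|)
  data.frame(
    x = runif(n, xlim[1], xlim[2]),  # given n, locations are uniform over A
    y = runif(n, ylim[1], ylim[2])
  )
}

set.seed(1)
pts <- simulate_hpp(lambda = 50, xlim = c(0, 2), ylim = c(0, 1))

# Empirical intensity: observed count divided by the window area (here 2)
nrow(pts) / 2

plot(pts$x, pts$y, asp = 1, pch = 16, xlab = "x", ylab = "y")
```

The empirical intensity varies from one realisation to the next, but averages to the chosen lambda over repeated simulations.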
In the next example, we will explore tools for visualizing spatial point patterns. Specifically, we will map the spatial distribution of the Ringlet butterfly in Scotland’s Cairngorms National Park (CNP).
BNM citizen science program
Citizen science initiatives have become an important source of information in ecological research, offering large volumes of species distribution data collected by volunteers for multiple taxonomic groups across wide spatial and temporal scales.
Butterflies for the New Millennium (BNM) is a large-scale monitoring scheme launched in the early 1970s to keep track of butterfly populations in the UK. With over 12 million butterfly sightings and more than 10,000 volunteers, this recording scheme has proven to be a successful program that has been used to assess long-term changes in the distributions of UK butterfly species.
Here we will focus on the distribution of the Ringlet butterfly, which holds particular significance in environmental studies as one of the habitat specialist species (UK Government, 2024). The data set consists of Ringlet butterfly presence-only records collected by volunteers in Scotland’s Cairngorms National Park (CNP).
Reading shapefiles into R
First, we load the geographical region of interest (i.e., the CNP boundaries). We can use the st_read function from the sf library to load the .shp file by specifying the path where you downloaded the files:
library(sf)
shp_SGC <- st_read("datasets/SG_CairngormsNationalPark/SG_CairngormsNationalPark_2010.shp", quiet = TRUE)
Then, we can transform it to an appropriate CRS for the UK (i.e., EPSG code: 27700):
shp_SGC <- shp_SGC %>% st_transform(crs = 27700)
st_crs(shp_SGC)$units
[1] "m"
Notice that the coordinate units are metres. Let’s change the spatial units to km so that the resulting distance and area values are more intuitive to interpret:
shp_SGC <- st_transform(shp_SGC, gsub("units=m", "units=km", st_crs(shp_SGC)$proj4string))
st_crs(shp_SGC)$units
[1] "km"
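As a quick check of what the unit change does, the following sketch applies the same gsub approach to a made-up square rather than the park boundary. The square's coordinates and the object names sq and sq_km are hypothetical, chosen only so that the expected area is easy to reason about (1000 m by 1000 m).

```r
library(sf)

# Hypothetical 1000 m x 1000 m square in EPSG:27700 (coordinates are made up)
corners <- rbind(
  c(300000, 800000), c(301000, 800000),
  c(301000, 801000), c(300000, 801000),
  c(300000, 800000)  # close the ring
)
sq <- st_sfc(st_polygon(list(corners)), crs = 27700)
st_area(sq)      # area reported in square metres

# Same unit swap as above: replace "units=m" with "units=km" in the proj4 string
sq_km <- st_transform(sq, gsub("units=m", "units=km", st_crs(sq)$proj4string))
st_crs(sq_km)$units
st_area(sq_km)   # area now reported in square kilometres (about 1)
```

Only the units change; the shape and location of the geometry are the same, so distances and areas are simply rescaled.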
We can then plot the CNP boundary as follows:
ggplot()+
geom_sf(data=shp_SGC)
Creating sf spatial objects in R
Now we will read the Ringlet butterfly records:
ringlett <- read.csv("datasets/bnm_ringlett.csv")
head(ringlett)
         y         x
1 57.58752 -2.712498
2 54.97742 -3.274879
3 54.89929 -3.771451
4 55.40323 -5.737059
5 54.91438 -3.959336
6 55.87255 -4.167174
The data set contains the longitude (x) and latitude (y) at which each observation was made. We can convert this into a spatial sf object using the st_as_sf function by declaring the columns in our data that contain the spatial coordinates:
ringlett_sf <- ringlett %>% st_as_sf(coords = c("x","y"), crs = "+proj=longlat +datum=WGS84")
The records are in longitude/latitude (WGS84), so before combining them with the park boundary we reproject them to the same CRS:
ringlett_sf <- ringlett_sf %>% st_transform(st_crs(shp_SGC))
We can subset one sf object by another with the same CRS using the same bracket syntax we use to subset a data frame in R. For example, if we want to subset the Ringlet butterfly occurrence records to those contained only within the CNP, we can type the following:
ringlett_CNP <- ringlett_sf[shp_SGC,] # keep only records within the CNP boundary
If we plot the ringlett_CNP object along with the CNP boundary, we should then obtain a map of the occurrence records within the park:
ggplot()+
geom_sf(data=shp_SGC)+
geom_sf(data=ringlett_CNP)
Reading Raster Data
We can use the terra R package to read raster files. The Scotland_elev.tiff raster contains the output of a digital elevation model for Scotland.
Once you have downloaded the raster, you can read it using the rast function by specifying the path where the file has been stored. Then, we assign the same CRS as our data.
library(terra)
elevation_r <- rast("datasets/Scotland_elev.tiff")
crs(elevation_r) = crs(shp_SGC)
plot(elevation_r)
We can apply different R functions to our rasters. For example, we can standardise the elevation values (subtract the mean and divide by the standard deviation) as follows:
elevation_r <- elevation_r %>% scale()
Lastly, we can crop the raster to the boundaries of our region of interest. Let’s crop the elevation raster to the CNP area using the crop function; setting mask = TRUE also sets the cells outside the boundary to NA:
elev_CNP <- terra::crop(elevation_r, shp_SGC, mask = TRUE)
plot(elev_CNP)
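To see what the mask argument and scale do without needing the elevation file, here is a small sketch on a synthetic raster. The 10 x 10 grid, its values, and the triangle are all made up for illustration; only the terra calls mirror the steps above.

```r
library(terra)

# Synthetic 10 x 10 raster with values 1..100 (illustrative only)
r <- rast(nrows = 10, ncols = 10, xmin = 0, xmax = 10, ymin = 0, ymax = 10)
crs(r) <- "EPSG:27700"
values(r) <- 1:ncell(r)

# Hypothetical triangular region of interest, built from a WKT string
tri <- vect("POLYGON ((2 2, 8 2, 5 8, 2 2))")
crs(tri) <- "EPSG:27700"

r_crop <- crop(r, tri)               # keeps the rectangular extent of the triangle
r_mask <- crop(r, tri, mask = TRUE)  # additionally sets cells outside the triangle to NA

global(is.na(r_crop), "sum")  # number of NA cells without masking
global(is.na(r_mask), "sum")  # number of NA cells with masking

# Standardise the values: the result has mean (approximately) zero
r_std <- scale(r)
global(r_std, "mean")

plot(r_mask)
lines(tri)
```

Cropping alone returns a rectangle, so masking is what makes the raster follow an irregular boundary such as the park outline.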